-
Notifications
You must be signed in to change notification settings - Fork 428
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add event tracing and ETDumps to executor_runner #5027
Add event tracing and ETDumps to executor_runner #5027
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5027
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit 662cb81 with merge base 7bc06d1 (): This comment was automatically generated by Dr. CI and updates every 15 minutes. |
@pytorchbot label 'partner: arm' |
@pytorchbot label ciflow/trunk |
Can't add following labels to PR: ciflow/trunk. Please ping one of the reviewers for help. |
Hi @GregoryComer. Would it be possible to run the CI on your side to see if the issue from the previous PR is still occurring? I'm having a hard time understanding where this comes from. |
@digantdesai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
2 similar comments
@digantdesai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@digantdesai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
- Enabled via EXECUTORCH_ENABLE_EVENT_TRACER - Add flag 'etdump_path' to specify the file path for the ETDump file - Add flag 'num_executions' for number of iterations to run - Create and pass event tracer 'ETDumpGen' - Save ETDump to disk - Update docs to reflect the changes Signed-off-by: Benjamin Klimczak <benjamin.klimczak@arm.com> Change-Id: I7e8e8b7f21453bb8d88fa2b9c2ef66c532f3ea46
3288eda
to
b09d09e
Compare
Hi @dbort . Sorry for dragging you into this, but I saw your comment on |
I don't see a CI failure anymore |
@digantdesai To me |
@digantdesai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Yeah I see, from the main CMakeList, and qnn does have |
Any update on this? |
Hi @digantdesai, I'm still hoping for some pointer from @dbort or you as I'm struggling to reproduce it locally and can't really make sense of the error. |
@digantdesai will you have a look at this one since it touches code outside arm delegate. Thx! |
The error shows up when running this script If you have a linux machine, can you follow https://pytorch.org/executorch/stable/build-run-qualcomm-ai-engine-direct-backend.html and see if the script fails? |
@cccclai I finally managed to reproduce the issue by running script |
- Raise a CMake error if event tracing is enabled without the devtools - Re-factoring of the changes in the portable executor_runner - Minor fix in docs Change-Id: Ia50fef8172f678f9cbe2b33e2178780ff983f335 Signed-off-by: Benjamin Klimczak <benjamin.klimczak@arm.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review! All issues be fixed now.
Change-Id: I0ebb22636cdd64aea24bcee51cba05496ed78b1f
Change-Id: I7d72e4d8f46ec727a60c9553851d5b71da8e91d4 Signed-off-by: Benjamin Klimczak <benjamin.klimczak@arm.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for refactoring executor_runner, it looks great. Remaining issue is the flatccrt dep
Signed-off-by: Benjamin Klimczak <benjamin.klimczak@arm.com> Change-Id: I5e7d8ef5d66bc3d5de36ea451b31fb3bdcd42d09
@dbort has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Thanks for updating the dependency, and again I apologize for how long this has taken me to review. I'm running internal tests now and should be able to merge this soon. |
Thanks @dbort ! There is a linker error remaining in the qnn tests that I could not reproduce locally and I don't understand where it is coming from. Maybe you have an idea... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, I see the failure
https://github.com/pytorch/executorch/actions/runs/12866353984/job/35868756676?pr=5027#step:14:34489
/usr/bin/ld: CMakeFiles/executor_runner.dir/examples/portable/executor_runner/executor_runner.cpp.o: in function `main':
executor_runner.cpp:(.text.main+0x5f9): undefined reference to `executorch::etdump::ETDumpGen::ETDumpGen(executorch::runtime::Span<unsigned char>)'
/usr/bin/ld: executor_runner.cpp:(.text.main+0x95d): undefined reference to `executorch::etdump::ETDumpGen::get_etdump_data()'
The build output says "EXECUTORCH_BUILD_DEVTOOLS : OFF", but the linker error implies that ET_EVENT_TRACER_ENABLED
is defined.
It does seem like the block you added to the top-level CMakeLists.txt should have triggered a SEND_ERROR in this case.
For debugging, it would help to print EXECUTORCH_ENABLE_EVENT_TRACER
in Utils.cmake.
And to reproduce, you might be able to use the cmake
command from the log:
https://github.com/pytorch/executorch/actions/runs/12866353984/job/35868756676?pr=5027#step:14:33817
cmake -DCMAKE_INSTALL_PREFIX=cmake-out -DCMAKE_BUILD_TYPE=Release -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON -DEXECUTORCH_BUILD_KERNELS_CUSTOM=OFF -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON -DEXECUTORCH_BUILD_XNNPACK=OFF -DEXECUTORCH_BUILD_MPS=OFF -DEXECUTORCH_BUILD_COREML=OFF -DEXECUTORCH_BUILD_QNN=ON -DQNN_SDK_ROOT=/tmp/qnn/2.28.0.241029 -DPYTHON_EXECUTABLE=python -Bcmake-out .
cmake --build cmake-out -j9 --target install --config Release
rather than waiting for the full CI to run. Could try removing the -DQNN_SDK_ROOT=/tmp/qnn/2.28.0.241029
part since it doesn't seem like it would affect this failure.
- Remove explicit addition of `-DET_EVENT_TRACER_ENABLED` from backends/qualcomm/CMakeLists.txt as setting the definition without enabling cmake flag `EXECUTORCH_ENABLE_EVENT_TRACER` caused issues when building the executor_runner. - Replace deprecated namespace `torch::executor` with `executorch::etdump` in the executor_runner.cpp. Signed-off-by: Benjamin Klimczak <benjamin.klimczak@arm.com> Change-Id: Iadff38374e661f42e394dc69903548922ca08aea
Thanks @dbort ! I managed to find and fix the issue by following your pointers a little further. I think the remaining CI failures are not related to my change. The problem was that the QNN backend always set the definition The fix: I removed the definition in the QNN backend here so that the behavior should now be fully controlled by the CMake flags. |
# | ||
# add compile option | ||
# | ||
target_compile_options(executorch PUBLIC -DET_EVENT_TRACER_ENABLED) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shewu-quic you added this line in https://github.com/pytorch/executorch/pull/2227/files#diff-0f6d37c62838592cf30db121c46cf34a9b9316df4b5a0d98f0ad7c7a98f7ff7eR261
It's currently causing problems because ET_EVENT_TRACER_ENABLED
only works when EXECUTORCH_BUILD_DEVTOOLS
is also enabled. Is it ok to remove this, or will other scripts/docs need to change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be automatically set by the main CMakeLists.txt
as long as the CMake flag EXECUTORCH_ENABLE_EVENT_TRACER
is enabled.
@benkli01 Thank you for tracking down the problem with the Qualcomm jobs! @shewu-quic @cccclai please let us know if the change to qualcomm/CMakeLists.txt is ok. |
@dbort has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the CI jobs look good, I'm fine with merging this. Thanks so much for taking the time to figure all of this out!
Re-upload of #4502 to discuss with @GregoryComer.